Tracking in Reinforcement Learning

Authors

  • Matthieu Geist
  • Olivier Pietquin
  • Gabriel Fricout
Abstract

Reinforcement learning induces non-stationarity at several levels. Adaptation to non-stationary environments is of course a desired feature of a fair RL algorithm. Yet, even if the environment of the learning agent can be considered stationary, generalized policy iteration frameworks, because they interleave learning and control, make the evaluated policy, and therefore its value function, non-stationary. Tracking the optimal solution instead of trying to converge to it is therefore preferable. In this paper, we propose to handle this tracking issue with a Kalman-based temporal difference framework. Its complexity and convergence are analyzed. Finally, an empirical investigation of its ability to handle non-stationarity is provided.
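
As a rough illustration of the tracking idea described in the abstract, the sketch below applies a standard linear Kalman filter to temporal difference learning. It is not the authors' algorithm: it assumes a linear value function V(s) = phi(s)·theta, a random-walk model for theta (which is what lets the estimate follow a moving target), and scalar noise levels. The names `ktd_linear_step`, `process_noise` and `obs_noise` are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def ktd_linear_step(
    theta: Vector,
    P: Matrix,
    phi: Vector,
    phi_next: Vector,
    reward: float,
    gamma: float = 0.9,
    process_noise: float = 1e-2,  # assumed random-walk variance per step
    obs_noise: float = 1.0,       # assumed variance of the reward observation
) -> Tuple[Vector, Matrix]:
    """One Kalman update of the value-function parameters from one transition."""
    n = len(theta)
    # Prediction: theta is modelled as a random walk, so the mean is unchanged
    # and the covariance grows by process_noise on the diagonal.
    P_pred = [[P[i][j] + (process_noise if i == j else 0.0) for j in range(n)]
              for i in range(n)]
    # Observation model from the Bellman evaluation equation:
    # reward ~ (phi(s) - gamma * phi(s')) . theta + noise
    h = [phi[i] - gamma * phi_next[i] for i in range(n)]
    Ph = [sum(P_pred[i][j] * h[j] for j in range(n)) for i in range(n)]
    innovation_var = sum(h[i] * Ph[i] for i in range(n)) + obs_noise
    gain = [Ph[i] / innovation_var for i in range(n)]
    # The innovation is the temporal difference error.
    td_error = reward - sum(h[i] * theta[i] for i in range(n))
    theta_new = [theta[i] + gain[i] * td_error for i in range(n)]
    # Covariance correction (P_pred is symmetric, so h^T P_pred equals Ph^T).
    P_new = [[P_pred[i][j] - gain[i] * Ph[j] for j in range(n)]
             for i in range(n)]
    return theta_new, P_new


# Toy non-stationary problem: one state with a self-loop whose reward
# changes from 1 to 3 halfway through, so the true value r / (1 - gamma) moves.
theta, P = [0.0], [[10.0]]
for t in range(400):
    r = 1.0 if t < 200 else 3.0
    theta, P = ktd_linear_step(theta, P, [1.0], [1.0], r, gamma=0.5)
```

Because `process_noise` keeps the covariance from shrinking to zero, the gain stays positive and the estimate keeps adapting after the reward changes; setting it to zero would recover a filter that converges and then stops tracking.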

Similar resources

A Novel Dynamic Target Tracking Algorithm for Image Based on Two-step Reinforcement Learning

In this article, we model image target tracking within a reinforcement learning framework and propose a two-step reinforcement learning algorithm for target tracking. In this algorithm, we set multiple tracker agents to track the pixels of the target; the aim of reinforcement learning is to obtain a tracking strategy for every tracker agent. We divided each learning step of tracker into two pa...

Repetitive Tracking Control of Nonlinear Systems Using Reinforcement Fuzzy-Neural Adaptive Iterative Learning Controller

This paper proposes a new fuzzy neural network based reinforcement adaptive iterative learning controller for a class of nonlinear systems. Different from some existing reinforcement learning schemes, the reinforcement adaptive iterative learning controller has the advantages of rigorous proofs without using an approximation of the plant Jacobian. The critic is appended into the reinforcement a...

Eye-Tracking Method’ Usage for Understanding the Cognitive Processes in Multimedia Learning

Introduction: The design of multimedia learning environments should be grounded in evidence-based studies and principles of the human learning process. Eye tracking is a method based on how learners process the learning materials presented in multimedia learning environments. The aim of the study was to examine the use of the eye-tracking method to investigate the cognitive processes in m...

Learning optimal switching policies for path tracking tasks on a mobile robot

A set of impedance controllers is used for both state estimation and tracking control on a mobile robot. State estimation is based on the states of a family of impedance controllers and tracking is implemented through a single controller from this set. Reinforcement learning techniques are used to create switching policies that optimize time or energy in a path tracking task.

Extrinsic Evaluation of Dialog State Tracking and Predictive Metrics for Dialog Policy Optimization

During the recent Dialog State Tracking Challenge (DSTC), a fundamental question was raised: “Would better performance in dialog state tracking translate to better performance of the optimized policy by reinforcement learning?” Also, during the challenge system evaluation, another nontrivial question arose: “Which evaluation metric and schedule would best predict improvement in overall dialog p...


Publication date: 2009